OCR exploration server

Warning: this site is under development.
Warning: this site was generated automatically from raw corpora.
The information has therefore not been validated.

Fast Structured Prediction Using Large Margin Sigmoid Belief Networks

Internal identifier: 000315 (Main/Exploration); previous: 000314; next: 000316

Authors: Xu Miao [United States]; Rajesh P. N. Rao [United States]

Source: International journal of computer vision (Int. j. comput. vis.), ISSN 0920-5691, 2012

RBID : Pascal:12-0330439

Abstract

Images usually contain multiple objects that are semantically related to one another. Mapping from low-level visual features to mutually dependent high-level semantics can be formulated as a structured prediction problem. Current statistical models for structured prediction make simplifying assumptions about the underlying output graph structure, such as assuming a low-order Markov chain, because exact inference becomes intractable as the tree-width of the underlying graph increases. Approximate inference algorithms, on the other hand, force one to trade off representational power with computational efficiency. In this paper, we present large margin sigmoid belief networks (LMSBNs) for structured prediction in images. LMSBNs allow a very fast inference algorithm for arbitrary graph structures that runs in polynomial time with high probability. This probability is data-distribution dependent and is maximized in learning. The new approach overcomes the representation-efficiency trade-off in previous models and allows fast structured prediction with complicated graph structures. We present results from applying a fully connected model to semantic image annotation, image retrieval and optical character recognition (OCR) problems, and demonstrate that the proposed approach can yield significant performance gains over current state-of-the-art methods.


The document in XML format

<record>
<TEI>
<teiHeader>
<fileDesc>
<titleStmt>
<title xml:lang="en" level="a">Fast Structured Prediction Using Large Margin Sigmoid Belief Networks</title>
<author>
<name sortKey="Xu Miao" sort="Xu Miao" uniqKey="Xu Miao" last="Xu Miao">XU MIAO</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Computer Science and Engineering, University of Washington</s1>
<s2>Seattle, WA 98125</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Washington (État)</region>
<settlement type="city">Seattle</settlement>
</placeName>
<orgName type="university">Université de Washington</orgName>
</affiliation>
</author>
<author>
<name sortKey="Rao, Rajesh P N" sort="Rao, Rajesh P N" uniqKey="Rao R" first="Rajesh P. N." last="Rao">Rajesh P. N. Rao</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Computer Science and Engineering, University of Washington</s1>
<s2>Seattle, WA 98125</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Washington (État)</region>
<settlement type="city">Seattle</settlement>
</placeName>
<orgName type="university">Université de Washington</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt>
<idno type="wicri:source">INIST</idno>
<idno type="inist">12-0330439</idno>
<date when="2012">2012</date>
<idno type="stanalyst">PASCAL 12-0330439 INIST</idno>
<idno type="RBID">Pascal:12-0330439</idno>
<idno type="wicri:Area/PascalFrancis/Corpus">000083</idno>
<idno type="wicri:Area/PascalFrancis/Curation">000689</idno>
<idno type="wicri:Area/PascalFrancis/Checkpoint">000077</idno>
<idno type="wicri:doubleKey">0920-5691:2012:Xu Miao:fast:structured:prediction</idno>
<idno type="wicri:Area/Main/Merge">000318</idno>
<idno type="wicri:Area/Main/Curation">000315</idno>
<idno type="wicri:Area/Main/Exploration">000315</idno>
</publicationStmt>
<sourceDesc>
<biblStruct>
<analytic>
<title xml:lang="en" level="a">Fast Structured Prediction Using Large Margin Sigmoid Belief Networks</title>
<author>
<name sortKey="Xu Miao" sort="Xu Miao" uniqKey="Xu Miao" last="Xu Miao">XU MIAO</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Computer Science and Engineering, University of Washington</s1>
<s2>Seattle, WA 98125</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Washington (État)</region>
<settlement type="city">Seattle</settlement>
</placeName>
<orgName type="university">Université de Washington</orgName>
</affiliation>
</author>
<author>
<name sortKey="Rao, Rajesh P N" sort="Rao, Rajesh P N" uniqKey="Rao R" first="Rajesh P. N." last="Rao">Rajesh P. N. Rao</name>
<affiliation wicri:level="4">
<inist:fA14 i1="01">
<s1>Computer Science and Engineering, University of Washington</s1>
<s2>Seattle, WA 98125</s2>
<s3>USA</s3>
<sZ>1 aut.</sZ>
<sZ>2 aut.</sZ>
</inist:fA14>
<country>États-Unis</country>
<placeName>
<region type="state">Washington (État)</region>
<settlement type="city">Seattle</settlement>
</placeName>
<orgName type="university">Université de Washington</orgName>
</affiliation>
</author>
</analytic>
<series>
<title level="j" type="main">International journal of computer vision</title>
<title level="j" type="abbreviated">Int. j. comput. vis.</title>
<idno type="ISSN">0920-5691</idno>
<imprint>
<date when="2012">2012</date>
</imprint>
</series>
</biblStruct>
</sourceDesc>
<seriesStmt>
<title level="j" type="main">International journal of computer vision</title>
<title level="j" type="abbreviated">Int. j. comput. vis.</title>
<idno type="ISSN">0920-5691</idno>
</seriesStmt>
</fileDesc>
<profileDesc>
<textClass>
<keywords scheme="KwdEn" xml:lang="en">
<term>Bayes network</term>
<term>Belief networks</term>
<term>Character recognition</term>
<term>Classification</term>
<term>Data distribution</term>
<term>Data structure</term>
<term>Efficiency</term>
<term>Energy consumption</term>
<term>Fast algorithm</term>
<term>Image processing</term>
<term>Image retrieval</term>
<term>Indexing</term>
<term>Inference</term>
<term>Information retrieval</term>
<term>Internal structure</term>
<term>Learning algorithm</term>
<term>Markov chain</term>
<term>Modeling</term>
<term>Multiple image</term>
<term>Optical character recognition</term>
<term>Optical image</term>
<term>Polynomial time</term>
<term>Probabilistic approach</term>
<term>Probability distribution</term>
<term>Semantics</term>
<term>Statistical analysis</term>
<term>Treewidth</term>
</keywords>
<keywords scheme="Pascal" xml:lang="fr">
<term>Classification</term>
<term>Structure donnée</term>
<term>Image multiple</term>
<term>Inférence</term>
<term>Temps polynomial</term>
<term>Recherche information</term>
<term>Indexation</term>
<term>Traitement image</term>
<term>Image optique</term>
<term>Reconnaissance optique caractère</term>
<term>Reconnaissance caractère</term>
<term>Réseau croyance</term>
<term>Sémantique</term>
<term>Largeur arborescente</term>
<term>Consommation énergie</term>
<term>Distribution donnée</term>
<term>Structure interne</term>
<term>Algorithme apprentissage</term>
<term>Analyse statistique</term>
<term>Modélisation</term>
<term>Chaîne Markov</term>
<term>Algorithme rapide</term>
<term>Approche probabiliste</term>
<term>Loi probabilité</term>
<term>Efficacité</term>
<term>Réseau Bayes</term>
<term>Recherche image</term>
</keywords>
<keywords scheme="Wicri" type="topic" xml:lang="fr">
<term>Classification</term>
</keywords>
</textClass>
</profileDesc>
</teiHeader>
<front>
<div type="abstract" xml:lang="en">Images usually contain multiple objects that are semantically related to one another. Mapping from low-level visual features to mutually dependent high-level semantics can be formulated as a structured prediction problem. Current statistical models for structured prediction make simplifying assumptions about the underlying output graph structure, such as assuming a low-order Markov chain, because exact inference becomes intractable as the tree-width of the underlying graph increases. Approximate inference algorithms, on the other hand, force one to trade off representational power with computational efficiency. In this paper, we present large margin sigmoid belief networks (LMSBNs) for structured prediction in images. LMSBNs allow a very fast inference algorithm for arbitrary graph structures that runs in polynomial time with high probability. This probability is data-distribution dependent and is maximized in learning. The new approach overcomes the representation-efficiency trade-off in previous models and allows fast structured prediction with complicated graph structures. We present results from applying a fully connected model to semantic image annotation, image retrieval and optical character recognition (OCR) problems, and demonstrate that the proposed approach can yield significant performance gains over current state-of-the-art methods.</div>
</front>
</TEI>
<affiliations>
<list>
<country>
<li>États-Unis</li>
</country>
<region>
<li>Washington (État)</li>
</region>
<settlement>
<li>Seattle</li>
</settlement>
<orgName>
<li>Université de Washington</li>
</orgName>
</list>
<tree>
<country name="États-Unis">
<region name="Washington (État)">
<name sortKey="Xu Miao" sort="Xu Miao" uniqKey="Xu Miao" last="Xu Miao">XU MIAO</name>
</region>
<name sortKey="Rao, Rajesh P N" sort="Rao, Rajesh P N" uniqKey="Rao R" first="Rajesh P. N." last="Rao">Rajesh P. N. Rao</name>
</country>
</tree>
</affiliations>
</record>
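
As a hedged illustration of how the record above might be read programmatically, the sketch below pulls the title, author names, RBID and year out of a trimmed excerpt using Python's standard `xml.etree.ElementTree`. The excerpt, the `RECORD_EXCERPT` constant and the `summarize_record` helper are assumptions made for this example and are not part of Dilib. The affiliation blocks are left out on purpose: their `inist:` and `wicri:` prefixes are not declared in the record as shown, so a strict XML parser would likely reject them.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the record above (hypothetical constant for this sketch).
# Affiliation blocks are omitted because their inist:/wicri: prefixes are not
# declared in the record, which a strict parser treats as an error.
RECORD_EXCERPT = """
<record>
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt>
          <title xml:lang="en" level="a">Fast Structured Prediction Using Large Margin Sigmoid Belief Networks</title>
          <author>
            <name sortKey="Xu Miao" sort="Xu Miao" uniqKey="Xu Miao" last="Xu Miao">XU MIAO</name>
          </author>
          <author>
            <name sortKey="Rao, Rajesh P N" sort="Rao, Rajesh P N" uniqKey="Rao R" first="Rajesh P. N." last="Rao">Rajesh P. N. Rao</name>
          </author>
        </titleStmt>
        <publicationStmt>
          <idno type="RBID">Pascal:12-0330439</idno>
          <date when="2012">2012</date>
        </publicationStmt>
      </fileDesc>
    </teiHeader>
  </TEI>
</record>
"""


def summarize_record(xml_text):
    """Return title, authors, RBID and year from a record excerpt (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    title_stmt = root.find("./TEI/teiHeader/fileDesc/titleStmt")
    pub_stmt = root.find("./TEI/teiHeader/fileDesc/publicationStmt")
    return {
        "title": title_stmt.findtext("title"),
        # One <name> element per <author>; keep the text exactly as stored.
        "authors": [name.text for name in title_stmt.findall("author/name")],
        # Several <idno> elements exist; select the one typed as RBID.
        "rbid": pub_stmt.findtext("idno[@type='RBID']"),
        "year": pub_stmt.find("date").get("when"),
    }


summary = summarize_record(RECORD_EXCERPT)
print(summary)
```

To parse the full record, one option would be to declare the missing prefixes on the root element first; the namespace URIs to use are not given on this page, so that step is left out here.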

To manipulate this document under Unix (Dilib)

EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000315 | SxmlIndent | more

Or

HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000315 | SxmlIndent | more

To link to this page within the Wicri network

{{Explor lien
   |wiki=    Ticri/CIDE
   |area=    OcrV1
   |flux=    Main
   |étape=   Exploration
   |type=    RBID
   |clé=     Pascal:12-0330439
   |texte=   Fast Structured Prediction Using Large Margin Sigmoid Belief Networks
}}

This area was generated with Dilib version V0.6.32.
Data generation: Sat Nov 11 16:53:45 2017. Site generation: Mon Mar 11 23:15:16 2024